The Munich Feature Enhancement Approach to the 2nd CHiME Challenge Using BLSTM Recurrent Neural Networks

Authors

  • Felix Weninger
  • Jürgen Geiger
  • Martin Wöllmer
  • Björn Schuller
  • Gerhard Rigoll
Abstract

We present a highly efficient, data-based method for monaural feature enhancement targeted at automatic speech recognition (ASR) in reverberant environments with highly non-stationary noise. Our approach is based on bidirectional Long Short-Term Memory (BLSTM) recurrent neural networks trained to map noise-corrupted features to clean features. In extensive test runs, enhanced features are evaluated with gradually refined recognition back-ends, ranging from simple maximum likelihood (ML) trained recognisers to state-of-the-art ASR using discriminative training and model adaptation techniques. Consistent improvements over the baseline ASR systems on both the small and medium vocabulary tasks of the 2nd CHiME Speech Separation and Recognition Challenge demonstrate the efficacy of the proposed method, which achieves up to 52% relative reduction of word error rate with respect to the multi-condition ML training baselines.
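
The mapping from noise-corrupted to clean features described in the abstract can be illustrated with a minimal sketch of a bidirectional LSTM forward pass. This is not the authors' implementation: the layer sizes, the random untrained weights, the peephole-free cell, the linear output layer, and all names (`LSTMLayer`, `blstm_enhance`, `noisy`) are assumptions made purely for illustration, and training (the step that would actually make the output resemble clean features) is omitted.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Matrix-vector product on plain Python lists.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class LSTMLayer:
    """Standard LSTM cell without peepholes (an assumption, not the paper's exact cell)."""

    def __init__(self, n_in, n_hid, rng):
        self.n_hid = n_hid
        # One weight matrix per gate (input, forget, output, candidate),
        # each acting on the concatenation [x_t ; h_{t-1}].
        self.W = {g: rand_matrix(n_hid, n_in + n_hid, rng) for g in "ifoc"}
        self.b = {g: [0.0] * n_hid for g in "ifoc"}

    def run(self, frames):
        h = [0.0] * self.n_hid
        c = [0.0] * self.n_hid
        outputs = []
        for x in frames:
            z = x + h  # list concatenation of input frame and previous hidden state
            pre = {g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])]
                   for g in "ifoc"}
            i = [sigmoid(v) for v in pre["i"]]
            f = [sigmoid(v) for v in pre["f"]]
            o = [sigmoid(v) for v in pre["o"]]
            cand = [math.tanh(v) for v in pre["c"]]
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, cand)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
            outputs.append(h)
        return outputs


def blstm_enhance(noisy, fwd, bwd, W_out, b_out):
    """Map each noisy feature frame to an output frame of the same dimension."""
    h_fwd = fwd.run(noisy)               # left-to-right context
    h_bwd = bwd.run(noisy[::-1])[::-1]   # right-to-left context, re-aligned in time
    return [[y + b for y, b in zip(matvec(W_out, hf + hb), b_out)]
            for hf, hb in zip(h_fwd, h_bwd)]


# Hypothetical toy input: 5 frames of 3-dimensional "noisy" features.
rng = random.Random(0)
n_feat, n_hid = 3, 4
noisy = [[rng.uniform(-1.0, 1.0) for _ in range(n_feat)] for _ in range(5)]

fwd = LSTMLayer(n_feat, n_hid, rng)
bwd = LSTMLayer(n_feat, n_hid, rng)
W_out = rand_matrix(n_feat, 2 * n_hid, rng)
b_out = [0.0] * n_feat

enhanced = blstm_enhance(noisy, fwd, bwd, W_out, b_out)
```

Because the backward layer reads the utterance in reverse, every output frame depends on both past and future input frames, which is the property that distinguishes a bidirectional network from a unidirectional one. In a trained system the weights would be fitted so that `enhanced` approximates the clean features paired with `noisy`.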

Related papers

The TUM+TUT+KUL Approach to the 2nd CHiME Challenge: Multi-Stream ASR Exploiting BLSTM Networks and Sparse NMF

We present our joint contribution to the 2nd CHiME Speech Separation and Recognition Challenge. Our system combines speech enhancement by supervised sparse non-negative matrix factorisation (NMF) with a multi-stream speech recognition system. In addition to a conventional MFCC HMM recogniser, predictions by a bidirectional Long Short-Term Memory recurrent neural network (BLSTM-RNN) and from non...

The Munich 2011 CHiME Challenge Contribution: NMF-BLSTM Speech Enhancement and Recognition for Reverberated Multisource Environments

We present the Munich contribution to the PASCAL ‘CHiME’ Speech Separation and Recognition Challenge: Our approach combines source separation by supervised convolutive non-negative matrix factorisation (NMF) with our tandem recogniser that augments acoustic features by word predictions of a Long Short-Term Memory recurrent neural network in a multi-stream Hidden Markov Model. The performance of...

Feature enhancement by deep LSTM networks for ASR in reverberant multisource environments

This article investigates speech feature enhancement based on deep bidirectional recurrent neural networks. The Long Short-Term Memory (LSTM) architecture is used to exploit a self-learnt amount of temporal context in learning the correspondences of noisy and reverberant with undistorted speech features. The resulting networks are applied to feature enhancement in the context of the 2013 2nd Co...

The ICSTM+TUM+UP Approach to the 3rd CHiME Challenge: Single-Channel LSTM Speech Enhancement with Multi-Channel Correlation Shaping Dereverberation and LSTM Language Models

This paper presents our contribution to the 3rd CHiME Speech Separation and Recognition Challenge. Our system uses Bidirectional Long Short-Term Memory (BLSTM) Recurrent Neural Networks (RNNs) for Single-channel Speech Enhancement (SSE). Networks are trained to predict clean speech as well as noise features from noisy speech features. In addition, the system applies two methods of dereverberati...

Combining Bottleneck-BLSTM and Semi-Supervised Sparse NMF for Recognition of Conversational Speech in Highly Instationary Noise

We address the speaker independent automatic recognition of spontaneous speech in highly variable noise by applying semisupervised sparse non-negative matrix factorization (NMF) for speech enhancement coupled with our recently proposed frontend utilizing bottleneck (BN) features generated by a bidirectional Long Short-Term Memory (BLSTM) recurrent neural network. In our evaluation, we unite the...

Publication date: 2013